Abstract

How to distribute welfare in a society is a key issue in distributive justice, which is deeply
involved with notions of fairness. Following a thought experiment by Dworkin, this work considers a
society of individuals with different preferences on the welfare distribution and an official who mediates the
coordination among them. Based on a simple assumption that an individual's welfare is proportional
to how well her preference is fulfilled by the actual distribution, we show that an egalitarian preference is a
strict Nash equilibrium and can be favorable even in certain inhomogeneous situations. These results suggest
how communication can encourage and secure a notion of fairness.

Introduction 

The concept of distributive justice has been extensively studied in political philosophy and economics 
over the past few decades. One of the most important milestones in this field is Rawls' A Theory of
Justice [1], which put forward equality as an outcome of the social contract that individuals behind a
veil of ignorance should agree on. Even though the egalitarian doctrine that all human persons are equal in
fundamental worth or moral status is a commonly shared idea, egalitarianism turns out to be a contested
concept. There have been several divergent understandings of the meaning of equality, ways to achieve 
equality, or the metric to measure equality [2][3]. For instance, someone who puts more emphasis on 
equality of opportunity may have a very different opinion from those who put emphasis on equality of 
incomes in spite of the overall agreement on the concept of equality per se. 

Among the numerous dimensions along which egalitarianism varies, the question of what should be equalized is
one on which many different theories compete: is it opportunity, capabilities, resources, or welfare [2-6]?
Dworkin, one of the most influential proponents of resource egalitarianism, admits the immediate appeal 
of the idea that it must ultimately be equality of welfare insofar as equality is important, and examines 
the logical consistency and practical applicability of this welfare egalitarianism [7] . According to Dworkin, 
welfare egalitarianism, concerned with equality in every person's overall satisfaction, has an inconsistency 
in its logic. For example, if one accepts the idea that those who are handicapped need more resources to 
achieve equal welfare, the same argument should apply to those who have expensive tastes for the same 
reason. However, one should immediately recognize that the appeal of welfare egalitarianism becomes 
much less strong in the case of expensive tastes than in the case of the handicapped. The fact that the 
same idea can be accepted in some cases and seems disturbing in other cases reveals a logical inconsistency 
of welfare egalitarianism. Dworkin also criticized welfare-based egalitarianism on the grounds that it inevitably relies
on the possibility of interpersonal comparisons of utility, which places a large burden on a policy maker
in practice. Lastly, Dworkin argues that it would probably prove impossible to reach a reasonable degree
of equality in this conception in a community whose members held very different and very deeply felt 
political theories about justice in distribution. 

The last point is the one that we focus on in this paper. We will show that the existence of contradictory
political theories does not immediately lead to the impossibility but can be formulated as
dynamics which admit a unique solution under certain assumptions. By doing this, we will argue that
the reasoning in [7] can be regarded as a tool to analyze and advocate the idea of equality in welfare. 
To some extent, this is complementary to a previous work which argues that one can reach the idea of 
equality in welfare by starting from that of equality in resources [8]. Following the logic that Dworkin 
used when he showed the impossibility to reach an agreement on redistribution in terms of welfare, we 
also set aside the issues of logical inconsistency and inter-personal comparisons of individuals' welfare. 
To focus on the relationship between individual preferences and resulting welfare, we also set aside the 
issue of impartiality (see, e.g., [9][10]). We will further assume that individuals have preferences over
the distribution of welfare among them [11][12]. Many theoretical and experimental studies have shown
that people are concerned with equality and fairness and often persist in fairness even when they lose 
monetary payoffs in doing so, e.g., in public goods games, ultimatum games, or dictator games. This behavior
cannot be explained on the assumption of self-regarding preferences but requires others-regarding
preferences or social preferences. Our formulation in this paper can serve as a systematic description for
such an approach, because it tells us a way to translate others' payoffs into an individual's with respect 
to individual preference. 

Model 

Let the concept of welfare be understood as the fulfillment of preferences [13], including success in political
preferences, i.e., opinions of how welfare should be distributed. Then the reasoning by Dworkin [7] argues
that such an egalitarian society, where everyone is concerned with equality, will end up supporting
non-egalitarians by its own logic. Suppose that a bigot enters an egalitarian society, with an opinion that
some people deserve more than the others. This person will feel frustrated to see that her political 
preferences are not accepted by egalitarian neighbors, and her welfare becomes relatively lower than 
the others'. If there is an official committed to compensating for inequality in welfare, by reallocating 
resources for example, the bigot should get extra resources from the official due to her political frustration, 
because she does not support the egalitarian idea of the society. This is called Dworkin's paradox in this 
work. Particularly we note that it can serve as an idealized model to represent our understanding of a 
modern democratic society. We will look into this hypothetical society a little closer. 

Imagine a society of N persons and an official. The official, representing a social institution, exists to 
mediate the global coordination. We assume that the total amount of welfare to be distributed among 
the persons is fixed as unity, and that the welfare is infinitely divisible, since we are interested only 
in relative fractions rather than absolute amounts that individuals have. The official herself does not 
take part in sharing the welfare, but only receives the N persons' opinions and finds a way to distribute
the welfare among them. Let each person i have a certain preference about how the welfare should be
distributed, say $\mathbf{v}_i = (v_{i1}, v_{i2}, \ldots, v_{iN})$ with $\sum_j v_{ij} = 1$. We denote the actual welfare distribution as
$\mathbf{r} = (r_1, r_2, \ldots, r_N)$. The person i's welfare is determined by the extent to which her preference is fulfilled.
In other words, we consider an equation 

$$r_i = \mathcal{F}(\mathbf{v}_i, \mathbf{r}) \tag{1}$$

with a certain function $\mathcal{F}$, which is assumed to apply equally to all the persons. We suppose that the
official wants to announce a stable welfare distribution $\mathbf{r}$ such that each person's relative share remains
unchanged after the announcement, which means that $\mathbf{r}$ solves Eq. (1) self-consistently. It is important to
note that the preferences reported by each individual are assumed to be true and available to the official 
at every moment, which helps us to focus on basic ideas of the paradox. A few remarks are in order. First, 
we emphasize that the official plays only a passive role in this setup. As we will see below, the society 
reaches the same self-consistent solution as long as every person's welfare becomes public knowledge all 
the time. The official may guarantee such information to be accessible and accelerate the coordination 
but the official is basically assigned limited tasks compared to the original argument. Second, related to
the first point, we do not require the division of welfare to be impartial from a certain observer's point of 
view. Our question is simply how much fulfillment one can get depending on her preference. In this sense, 
our approach differs from the impartial-division problem [S] and does not touch conceptual difficulties of 
impartiality (see, e.g., [10]). Finally, individuals are not behind the veil of ignorance. Rather, each of
them is supposed to construct a concrete opinion about every other individual using any kind of available 
information. Although this can impose practical difficulties in a large society, it helps us avoid any 
theoretical ambiguity or conflict with the ethic of priority [14] found in the veil of ignorance [15].

In order to give a more concrete form to Eq. (1), we first consider how to measure similarity or affinity
between distributions and then plug it into Eq. (1). Suppose two arbitrary distributions, $\mathbf{p} = (p_1, \ldots, p_N)$
and $\mathbf{q} = (q_1, \ldots, q_N)$, with $p_i \ge 0$, $q_i \ge 0$, and $\sum_i p_i = \sum_i q_i = 1$. We define a suitable affinity function
$\rho_N(\mathbf{p}, \mathbf{q})$ between them, whose specific functional form will be characterized by requiring the following
four postulates [16]. First, we postulate separability, which means that one can refine the affinity contribution
from a certain bin by looking into the bin in a higher resolution without referring to the outside of the bin.
Second, we postulate invariance under permutation, because every bin is equivalent. Third, the affinity
should be non-negative, i.e., $\rho_N(\mathbf{p}, \mathbf{q}) \ge 0$, where $\rho_N(\mathbf{p}, \mathbf{q}) = 0$ if and only if $\mathbf{p}$ is orthogonal to $\mathbf{q}$, whereas
a maximum value is obtained if and only if $\mathbf{p} = \mathbf{q}$. The distributions $\mathbf{p}$ and $\mathbf{q}$ are orthogonal when $p_i = 0$
for every non-zero $q_i$ and vice versa. Last, it should be symmetric in the sense that $\rho_N(\mathbf{p}, \mathbf{q}) = \rho_N(\mathbf{q}, \mathbf{p})$,
which is intuitively justified. These four postulates characterize our affinity function as

$$\rho_N(\mathbf{p}, \mathbf{q}) \propto \sum_{i=1}^{N} (p_i q_i)^{1/2}, \tag{2}$$

commonly known as the Bhattacharyya measure [17]. While details of the derivation are shown in the
Appendix, this functional form has a clear geometric interpretation: it can be viewed as a dot product of
two vectors $(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_N})$ and $(\sqrt{q_1}, \sqrt{q_2}, \ldots, \sqrt{q_N})$, both of which are located on an N-dimensional
unit sphere by $\sum_i (\sqrt{p_i})^2 = \sum_i (\sqrt{q_i})^2 = 1$. It therefore becomes maximized when the two vectors point in
the same direction.
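The affinity of Eq. (2) can be illustrated with a minimal Python sketch. The function name `affinity` and the choice of a unit proportionality constant are assumptions made here for illustration only.

```python
import math

def affinity(p, q):
    """Bhattacharyya affinity sum_i sqrt(p_i * q_i), with the constant set to 1 (assumption)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

# Identical distributions reach the maximum, orthogonal ones give zero.
same = affinity([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
orthogonal = affinity([1.0, 0.0], [0.0, 1.0])
# An egalitarian preference compared with a very selfish one.
mixed = affinity([0.5, 0.5], [0.01, 0.99])
print(same, orthogonal, mixed)
```

With this normalization the value lies between 0 and 1, which matches the dot-product picture of two unit vectors described above.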

It is plausible to assume that the function $\mathcal{F}(\mathbf{v}_i, \mathbf{r})$ in Eq. (1) will be a non-decreasing function of
the affinity between $\mathbf{v}_i$ and $\mathbf{r}$, so that $\mathcal{F}(\mathbf{v}_i, \mathbf{r}) = \mathcal{F}[\rho_N(\mathbf{v}_i, \mathbf{r})]$. Specifically, we infer that $\mathcal{F}(\mathbf{v}_i, \mathbf{r}) \propto
\rho_N^2(\mathbf{v}_i, \mathbf{r})$, since $\rho_N(\mathbf{v}_i, \mathbf{r})$ contains dimensionality of $\sqrt{r_i}$ according to Eq. (2). The precise value of
the proportionality coefficient should be determined by the normalization condition of $\mathbf{r}$. For notational
convenience, let us define $\mathbf{s} = (s_1, s_2, \ldots, s_N)$ with $s_i = \sqrt{r_i}$ and $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{iN})$ with $w_{ij} = \sqrt{v_{ij}}$.
Equation (1) then leads to $s_i \propto \sum_j w_{ij} s_j$, or in a matrix form,

$$\mathbf{s} = \Lambda^{-1} W \mathbf{s}, \tag{3}$$

where $W = \{w_{ij}\}$ and $\Lambda$ is for normalizing $|\mathbf{s}|^2$, the total welfare. This formalism is reminiscent of
quantum mechanics, where a wavefunction $\psi$ is obtained by solving an eigenvalue problem $\mathcal{H}\psi = E\psi$
with a Hamiltonian matrix $\mathcal{H}$ and its eigenvalue $E$. What one can measure in experiments is the probability
density $|\psi|^2$. An N-dimensional matrix preserving $|\mathbf{s}|^2$ is called orthogonal, and its number of degrees of freedom is
the number of possible planes of rotation in N dimensions, which is $N(N-1)/2$. Since $W$ generally has
$N^2$ elements and N normalization conditions, it has $N(N-1)$ degrees of freedom, so the magnitude of
$\Lambda$ will differ from one in general.
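One way to solve Eq. (3) numerically is repeated multiplication by $W$ followed by renormalization (power iteration). The sketch below is only an illustration under that assumption; the function name `stable_distribution` and the three-person egalitarian example are hypothetical and not taken from the original analysis.

```python
import math

def stable_distribution(V, iterations=1000):
    """Iterate s <- W s / |W s| with w_ij = sqrt(v_ij); return (r, Lambda) with r_i = s_i**2."""
    n = len(V)
    W = [[math.sqrt(v) for v in row] for row in V]
    s = [1.0 / math.sqrt(n)] * n          # start from equal welfare
    lam = 0.0
    for _ in range(iterations):
        t = [sum(W[i][j] * s[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in t))   # normalization factor, |W s|
        s = [x / lam for x in t]
    return [x * x for x in s], lam

# Three egalitarians: every row of preferences is (1/3, 1/3, 1/3).
egalitarian = [[1.0 / 3.0] * 3 for _ in range(3)]
r, lam = stable_distribution(egalitarian)
print(r, lam)
```

In this example the shares stay equal while the normalization factor is $\sqrt{3}$ rather than one, in line with the remark that the magnitude of $\Lambda$ generally differs from one.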
Results

Two-person case 

The simplest example of W describes a situation where two persons have not ever conceived of each other 
as a society member to share welfare with. The corresponding matrix is written as

$$W = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

This identity matrix does not change the input state at all, which means that the official cannot really 
coordinate these two indifferent persons' opinions in the way that we have assumed. This is actually an 
example of a reducible matrix [18] or a society that can be divided into smaller pieces: $W$ is irreducible if
there exists a sequence of $[k_1, k_2, \ldots, k_n]$ for any i and j such that $w_{ik_1} \times w_{k_1 k_2} \times \cdots \times w_{k_n j}$ is non-zero.
Otherwise, $W$ is reducible. Such a reducible case is not our concern since a society is meaningful only
when individuals interact with each other. Henceforth, only irreducible cases are considered. Then, unless
everyone has zero self-interest, one can prove that there exists a unique stable distribution $\mathbf{r}_p$ for every
$W$ by using the Perron-Frobenius theorem [18]. In other words, $\mathbf{r}_p$ is the only stable fixed point under
the action of $W$, so the official should distribute welfare as given by $\mathbf{r}_p$.

Let us consider a situation where an egalitarian with $\mathbf{v}_1 = (1/2, 1/2)$ meets a selfish person with
$\mathbf{v}_2 = (0.01, 0.99)$. The corresponding matrix formulation will be

$$\Lambda \begin{pmatrix} \sqrt{r_1} \\ \sqrt{r_2} \end{pmatrix} = \begin{pmatrix} \sqrt{1/2} & \sqrt{1/2} \\ \sqrt{0.01} & \sqrt{0.99} \end{pmatrix} \begin{pmatrix} \sqrt{r_1} \\ \sqrt{r_2} \end{pmatrix}, \tag{4}$$

and its stable welfare distribution is obtained by analyzing eigenvectors as $\mathbf{r}_p \approx (0.72, 0.28)$. We arrive
at this $\mathbf{r}_p$ even if we start from $\mathbf{r} = (0.5, 0.5)$ for the following reason: let $\mathbf{r}$ be known to both persons
at every time step. The egalitarian first feels happy to see the initial equality in $\mathbf{r} = (0.5, 0.5)$, while the
selfish person feels unsatisfied, which makes a difference at the next step. The drop in $r_2$ makes the selfish
person even more upset, so her welfare continues to decrease until it reaches the stable value, $r_2 = 0.28$.
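The step-by-step adjustment described here can be sketched as follows, assuming that at every time step each person's share is reset in proportion to the squared affinity between her preference and the currently announced distribution. The helper name `step` and the number of steps are illustrative choices.

```python
import math

# Preferences: person 1 is an egalitarian, person 2 is selfish.
v = [[0.5, 0.5], [0.01, 0.99]]

def step(r):
    """One announcement: r_i is proportional to the squared affinity between v_i and r."""
    raw = [sum(math.sqrt(vij * rj) for vij, rj in zip(vi, r)) ** 2 for vi in v]
    total = sum(raw)
    return [x / total for x in raw]

r = [0.5, 0.5]            # initial equal distribution
history = [r]
for _ in range(50):
    r = step(r)
    history.append(r)
print(history[1], r)
```

Under these assumptions the selfish person's share falls at every step and settles near the stable value quoted above.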

We now show that egalitarianism is the minimax solution of this two-person zero-sum game [19].
Generalizing Eq. (4) as

$$\Lambda \begin{pmatrix} \sqrt{r_1} \\ \sqrt{r_2} \end{pmatrix} = \begin{pmatrix} \sqrt{v_{11}} & \sqrt{1 - v_{11}} \\ \sqrt{1 - v_{22}} & \sqrt{v_{22}} \end{pmatrix} \begin{pmatrix} \sqrt{r_1} \\ \sqrt{r_2} \end{pmatrix}, \tag{5}$$

the eigenvalue analysis yields the converged share for the first person, $r_1$, as shown in Fig. 1. It has a
saddle-like shape, and this person can minimize risk when she has demanded a moderate share of 1/2 in
the first place. The same is true for the other person as well. Although we have assumed fixed preferences
in developing the model, this plot shows that, if the preferences can evolve in the long run to maximize
individual welfare, this two-person case will lead to an equal welfare distribution.

In practice, a selfish person can be tempted to deceive the official by reporting a false preference to
receive a larger share. Provided that person 2 has claimed her self-interest as a certain value $v_{22}$, person
1 can always compute the best reply $\tilde{v}_{11} = b(v_{22})$ by looking up the maximal share $\tilde{r}_1$ at the given $v_{22}$
in Fig. 1. Even if her true self-interest $v_{11}$ is higher than this false $\tilde{v}_{11}$, she should still report $\tilde{v}_{11}$ to the
official, knowing that she cannot get better than $\tilde{r}_1$ in any way. When person 1 has chosen $\tilde{v}_{11}$ for this
reason, the same consideration will lead person 2 to choose $\tilde{v}_{22} = b(\tilde{v}_{11}) = b[b(v_{22})]$, and this reasoning
can be repeated between them ad infinitum. Such a strategic consideration eventually forces them to
choose the egalitarian preference in common, since successive iteration of the best-reply function $b$ drives
every initial input $v_{22} \in [0, 1)$ into the egalitarian fixed point, although neither of the players is really
an egalitarian.
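The iteration of the best-reply function $b$ can be explored numerically. The sketch below assumes the two-person matrix with rows $(\sqrt{v_{11}}, \sqrt{1-v_{11}})$ and $(\sqrt{1-v_{22}}, \sqrt{v_{22}})$, solves its dominant eigenvector in closed form, and approximates $b$ by a grid search; the grid resolution, the starting report 0.99, and all function names are illustrative assumptions.

```python
import math

def share_of_person_1(v11, v22):
    """Converged r_1 for the matrix [[sqrt(v11), sqrt(1-v11)], [sqrt(1-v22), sqrt(v22)]]."""
    a, b = math.sqrt(v11), math.sqrt(1.0 - v11)
    c, d = math.sqrt(1.0 - v22), math.sqrt(v22)
    # t = s_2 / s_1 is the positive root of b t^2 + (a - d) t - c = 0.
    t = ((d - a) + math.sqrt((a - d) ** 2 + 4.0 * b * c)) / (2.0 * b)
    return 1.0 / (1.0 + t * t)

GRID = [k / 1000 for k in range(1000)]   # candidate reports in [0, 1)

def best_reply(v_other):
    """Grid approximation of b: the report that maximizes one's own converged share."""
    return max(GRID, key=lambda v: share_of_person_1(v, v_other))

report = 0.99                 # a very selfish opening claim
trajectory = [report]
for _ in range(6):
    # By symmetry the same function serves both persons in turn.
    report = best_reply(report)
    trajectory.append(report)
print(trajectory)
```

Because the two roles are symmetric, one function serves as the best reply for both persons, and under these assumptions the successive reports approach 1/2.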
Egalitarianism as a Nash equilibrium

Let us consider an N-person case where all except one are egalitarian. That is, $\mathbf{v}_i = (1/N, \ldots, 1/N)$ for
every $i \neq 1$. We observe that these $N-1$ persons will have exactly the same welfare since they always
get the same amount of affinity for any welfare distribution $\mathbf{r}$. Let us thus denote every egalitarian's
welfare as a single variable $R$. Recalling the separability, we find that all the elements $v_{1i}$ with $i \neq 1$
must be the same in order to maximize person 1's welfare, because her preference about the egalitarians
should match the welfare distribution among them. Therefore, person 1 should have a preference of
$\mathbf{v}_1 = (v_{11}, V, V, \ldots, V)$ with $V = (1 - v_{11})/(N-1)$. As a consequence, the full $N \times N$ matrix calculation
can be simplified to the following $2 \times 2$ matrix calculation

$$\begin{pmatrix} \sqrt{v_{11}} & (N-1)\sqrt{V} \\ \sqrt{1/N} & (N-1)\sqrt{1/N} \end{pmatrix} \begin{pmatrix} \sqrt{r_1} \\ \sqrt{R} \end{pmatrix} = \begin{pmatrix} \sqrt{v_{11} r_1} + (N-1)\sqrt{VR} \\ \sqrt{r_1/N} + (N-1)\sqrt{R/N} \end{pmatrix}$$

with a normalization condition $r_1 + (N-1)R = 1$. The stable distribution from the simplified matrix
then yields $r_1$ as a function of $v_{11}$, which has $dr_1/dv_{11} = 0$ and $r_1 = 1/N$ at $v_{11} = 1/N$. In short, the best
possible preference for person 1 is an egalitarian one. If egalitarianism is pervasive, one gets worse off by
having another type of preference, which means that egalitarianism is a strict Nash equilibrium [20]. This
reproduces Dworkin's paradox in mathematical terms in the sense that a non-egalitarian in an egalitarian
society will have relatively less welfare. A difference from the original paradox is that the official cannot
really compensate the non-egalitarian within our formulation since the welfare distribution will converge
to the same point again as soon as the compensation is known in public.
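The equilibrium property can be checked numerically on the full $N \times N$ problem. The following sketch assumes $N = 5$, a grid of candidate values for $v_{11}$, and power iteration as the solver; none of these choices come from the original analysis.

```python
import math

def stable_distribution(V, iterations=500):
    """Power iteration on w_ij = sqrt(v_ij); returns the converged shares r_i."""
    n = len(V)
    W = [[math.sqrt(v) for v in row] for row in V]
    s = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        t = [sum(W[i][j] * s[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in t))
        s = [x / norm for x in t]
    return [x * x for x in s]

def deviant_share(v11, n):
    """Share of person 1, who keeps v11 and splits the rest equally among n - 1 egalitarians."""
    rest = (1.0 - v11) / (n - 1)
    V = [[v11] + [rest] * (n - 1)] + [[1.0 / n] * n for _ in range(n - 1)]
    return stable_distribution(V)[0]

N = 5
grid = [k / 100 for k in range(100)]          # candidate self-interest values in [0, 1)
shares = [deviant_share(v, N) for v in grid]
best_v11 = grid[shares.index(max(shares))]
print(best_v11, max(shares))
```

On this grid the largest share for person 1 is found at the egalitarian choice $v_{11} = 1/N$, where it equals $1/N$.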



Inhomogeneous society 

Egalitarian preference can still be favorable even when people are all different. For instance, people
are not born equal. Let this unavoidable inequality be described by a uniform random variable $\zeta_j \in
[-1, 1]$. Person i's overall political preference can be described by another uniform random variable
$\phi_i \in (-1/N, 1/N)$: for $\phi_i = 0$, this person is an egalitarian. If $\phi_i > 0$, she believes that the better deserve
more, while $\phi_i < 0$ means the opposite. In addition, $\phi_i$ is assumed to be uncorrelated with $\zeta_i$. The political
preference is then assigned as $v_{ij} = \phi_i \zeta_j + 1/N$, which satisfies $v_{ij} > 0$. Since $\zeta_j$ is a relative quantity, one
can always subtract an offset value to make $\sum_j \zeta_j = 0$, by which the normalization condition $\sum_j v_{ij} = 1$
is satisfied. We can obtain the stable distribution by taking N as a very large constant and assuming
that $r_i$ is a function of $\phi_i$ only. By replacing the summation in $\sqrt{r_i(\phi_i)} \propto \sum_{j=1}^{N} \sqrt{\phi_i \zeta_j + 1/N}\, \sqrt{r_j(\phi_j)}$
by an integral, we get

$$r_i(\phi_i) = 2\phi_i^{-2} \left[ (\phi_i + 1/N)^{3/2} - (-\phi_i + 1/N)^{3/2} \right]^2 / (3\pi + 8),$$

where the proportionality coefficient is determined by the normalization condition. This $r_i$ is an even
function of $\phi_i$ with a maximum $r_i(\phi_i = 0) = 18/(3\pi + 8) \times 1/N \approx 1.033/N$, implying the highest
fulfillment for an egalitarian. One could point out that this $r_i(\phi_i)$ describes just one possible distribution,
not necessarily the stable one. However, we see that the integral is positive for all $\phi_i$, and thus for all i,
and the Perron-Frobenius theorem tells us that this positivity is true only for the stable distribution [18].
This justifies our starting assumption that person i's welfare $r_i$ is not determined by her innate part $\zeta_i$ but
by her political preference $\phi_i$ in this society. It is notable that critics have said that the idea of equality
in welfare is insensitive to individual responsibility [7]. As explained in [21], one may consider a set of
variables characterizing an individual and classify them into two categories: the first category consists of
innate properties such as talents that an individual is hardly responsible for. The second category, on the
other hand, includes choices and even some preferences that we can connect to individual responsibility.
If we regard $\zeta_i$ as representing the first category and $\phi_i$ as representing the second category, this example
shows that each individual does take responsibility for her political preference but not for her talents.
It could also be argued that the limit $N \to \infty$ squeezes $\phi_i \in (-1/N, 1/N)$ into zero so that the whole
problem reduces to the egalitarian society above, where everyone gets $r_i = 1/N$. That can be regarded as
a first-order approximation of this problem. The calculation given here shows that an egalitarian indeed
receives 3.3% more than in the crude approximation.
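The closed form for $r_i(\phi_i)$ can be checked with a short numerical sketch. It uses the rescaled variable $x = N\phi_i \in (-1, 1)$, for which $N r_i$ no longer depends on $N$; the function name `scaled_share` and the midpoint-rule resolution are assumptions made for illustration.

```python
import math

NORM = 3.0 * math.pi + 8.0

def scaled_share(x):
    """N * r_i as a function of x = N * phi_i, following the closed form for r_i(phi_i)."""
    if x == 0.0:
        return 18.0 / NORM                    # limit phi_i -> 0
    bracket = (1.0 + x) ** 1.5 - (1.0 - x) ** 1.5
    return 2.0 * bracket ** 2 / (x * x * NORM)

peak = scaled_share(0.0)

# Midpoint rule for the population average of N * r_i over uniform x in (-1, 1).
# The average should be one if the shares sum to the total welfare.
M = 20000
mean = sum(scaled_share(-1.0 + (k + 0.5) * 2.0 / M) for k in range(M)) / M
print(peak, mean)
```

The average close to one reflects the normalization of the total welfare, and the peak at $x = 0$ corresponds to the egalitarian's share of about $1.033/N$.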

Homogeneously unequal preference 

In all the cases considered so far, preferences could be said to be neutral on the society level in the sense
that there is no systematic bias over the whole society. Let us now imagine that $N-1$ persons have
identical preferences, $\mathbf{h} = (h_1, h_2, \ldots, h_N)$, but are not necessarily egalitarians. The other person, indexed by
k, has another type of preference, $\mathbf{v}_k = (v_{k1}, v_{k2}, \ldots, v_{kN})$. By reasoning similar to that in the egalitarian
society, the situation can be simplified to the following $2 \times 2$ matrix:

$$\begin{pmatrix} \sqrt{v_{kk}} & \sqrt{(N-1)(1-v_{kk})} \\ \sqrt{h_k} & {\sum_i}' \sqrt{h_i} \end{pmatrix},$$

where ${\sum_i}'$ means a summation over i excluding k. Recall that the $(N-1)$ persons with an identical
preference have the same amount of fulfillment, so person k should not distinguish them in order to
maximize affinity between her preference and the welfare distribution among them. The above matrix
means that we should only determine how to divide welfare between the person k and the other $(N-1)$
persons. The eigenvalue analysis leads to

$$r_k = \frac{1 - v_{kk}}{1 - v_{kk} + X^2/4} \tag{6}$$

with

$$X = \sqrt{v_{kk}} - H - \sqrt{\left(\sqrt{v_{kk}} - H\right)^2 + 4\sqrt{h_k (N-1)(1 - v_{kk})}}$$

and $H = {\sum_i}' \sqrt{h_i}$. We can differentiate Eq. (6) with respect to $v_{kk}$ to find the maximum. An easier
alternative way is to observe from the separability that the maximum value is obtained when $r_k = v_{kk}$,
because the question is how to match person k's preference $(v_{kk}, 1 - v_{kk})$ with the welfare distribution
$(r_k, 1 - r_k)$, where the second elements represent the whole $(N-1)$ persons. By solving $r_k = v_{kk}$ with
Eq. (6), one can get $v_{kk}$ maximizing $r_k$ as a solution of the following equation,

$$[h_k (N-1) + H^2] w^4 - 2H w^3 + (1 - H^2) w^2 + 2H w - 1 = 0, \tag{7}$$

with $w = \sqrt{v_{kk}}$. For the egalitarian $\mathbf{h}$ with $h_k = 1/N$ and $H = (N-1)\sqrt{1/N}$, substituting $v_{kk} = 1/N$
satisfies Eq. (7), consistently with the analysis of the $(N-1)$ egalitarians. We may also suppose that
$\mathbf{h}$ describes an unequal distribution so that the whole society is biased in a certain way. As a specific
example, let us assume that $h_i \propto i^2$, that is, almost everyone wants people with higher indices to have more
welfare. With a normalization constant, it should mean that $h_i = i^2/Z$ with $Z = N(N+1)(2N+1)/6$,
and we thus have

$$H = \sum_{i=1}^{N} \sqrt{h_i} - \sqrt{h_k} = \frac{\sqrt{6N(N+1)}}{2\sqrt{2N+1}} - \frac{k}{\sqrt{Z}}.$$

Inserting this into Eq. (6), we plot $r_k$ in Fig. 2, where the maximum is found at the crossing with $r_k = v_{kk}$.
It is a little higher than 1/N for every k. We can do the same calculation for a more severe situation
of inequality by setting $h_i \propto i^4$, which again yields the same conclusion with a slightly larger $v_{kk}$. The
difference between the optimal $v_{kk}$ and 1/N does not vanish as $N \to \infty$ whether $h_i \propto i^2$ or $i^4$. This can
be shown by inserting $v_{kk} = 1/N$ on the left-hand side of Eq. (7) and taking $N \to \infty$, which does not
yield zero on the right-hand side. Therefore, the person k should demand a little more for herself than
before and distribute the remainder of the preferences equally to the others, even though they are far
from egalitarians, in order to get the maximum welfare.
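The crossing $r_k = v_{kk}$ can be located numerically. The sketch below evaluates Eq. (6) for $h_i = i^2/Z$ with $N = 10$ and finds the crossing by bisection instead of solving the quartic Eq. (7) directly; the function names, the bisection bounds, and the iteration count are illustrative assumptions.

```python
import math

N = 10
Z = N * (N + 1) * (2 * N + 1) / 6          # normalization of h_i = i^2 / Z

def share(v_kk, k):
    """Eq. (6): welfare of person k, who keeps v_kk and splits the rest equally."""
    h_k = k * k / Z
    H = sum(math.sqrt(i * i / Z) for i in range(1, N + 1)) - math.sqrt(h_k)
    w = math.sqrt(v_kk)
    X = w - H - math.sqrt((w - H) ** 2 + 4.0 * math.sqrt(h_k * (N - 1) * (1.0 - v_kk)))
    return (1.0 - v_kk) / (1.0 - v_kk + X * X / 4.0)

def optimal_self_interest(k):
    """Bisection on the crossing r_k = v_kk; share(v, k) exceeds v below it and not above."""
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if share(mid, k) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

optimum = {k: optimal_self_interest(k) for k in range(1, N + 1)}
print(optimum)
```

For every index k the crossing found this way lies somewhat above $1/N$, in line with the statement that person k should demand a little more for herself than in the egalitarian society.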

Transient behavior 

In order to see whether egalitarians can eventually take over the society, we need to check whether the
egalitarian preference remains an attractive alternative when the society has both egalitarians and
non-egalitarians in significant numbers. Let us imagine an inhomogeneous society where there are
roughly two large groups: every person in one group of size $N - M$ occupies a high index i and believes
that the welfare should be proportional to $i^2$. On the other hand, every person in the other group of
size $M - 1$ has a low index and an egalitarian preference. Our question is what kind of preference is
good for a person on the border, i.e., with index $i = M$. Again, since people with identical preferences
will get the same amount of welfare, the focal person on the border need not distinguish the members
in each group: suppose that she wishes $v_{M1}$ for each member in the egalitarian group and $v_{MN}$ for
each member in the non-egalitarian group. The normalization condition then determines her self-interest
$v_{MM} = 1 - (M-1)v_{M1} - (N-M)v_{MN}$. Depending on how she decides $v_{M1}$ and $v_{MN}$, her final welfare
$r_M$ will be calculated by analyzing the following $3 \times 3$ matrix,



where the second row describes this focal person M. The conservation of the total welfare is imposed by
setting $(M-1)r_{M1} + r_{MM} + (N-M)r_{MN} = 1$. When M is small, the maximum of $r_M$ is close to the
egalitarian solution $(v_{M1}, v_{MN}) = (1/N, 1/N)$ (Fig. 3A). It agrees with the result of the homogeneously
unequal preferences given above since the egalitarian preference is still an absolute minority. Hence, if
this person M can choose her own preference, the society will possibly have one more egalitarian. As
M becomes larger, however, the situation gets different in that the maximum is located far from the
egalitarian solution (Fig. 3B). It implies that the transition process toward the egalitarian direction may
exhibit transient behavior, instead of being smooth all the time.
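The two-group setting can also be explored without the reduced matrix, by building the full $N \times N$ preference matrix and iterating it. The sketch below is only one possible construction: it assumes that the non-egalitarian group assigns $h_j = j^2/Z$ to person j, as in the previous section, and the function names and the sample arguments are hypothetical.

```python
import math

def stable_distribution(V, iterations=2000):
    """Power iteration on w_ij = sqrt(v_ij); returns the converged shares r_i."""
    n = len(V)
    W = [[math.sqrt(v) for v in row] for row in V]
    s = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        t = [sum(W[i][j] * s[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in t))
        s = [x / norm for x in t]
    return [x * x for x in s]

def society_welfare(N, M, v_M1, v_MN):
    """Shares of all N persons; person M (list index M - 1) is the focal person."""
    Z = N * (N + 1) * (2 * N + 1) / 6
    egalitarian = [1.0 / N] * N
    unequal = [j * j / Z for j in range(1, N + 1)]       # assumed form of the i^2 belief
    v_MM = 1.0 - (M - 1) * v_M1 - (N - M) * v_MN         # normalization of the focal row
    focal = [v_M1] * (M - 1) + [v_MM] + [v_MN] * (N - M)
    V = [egalitarian] * (M - 1) + [focal] + [unequal] * (N - M)
    return stable_distribution(V)

r_small_M = society_welfare(10, 2, 0.1, 0.1)
r_large_M = society_welfare(10, 5, 0.1, 0.1)
print(r_small_M[1], r_large_M[4])
```

Scanning `v_M1` and `v_MN` over the admissible region and recording the focal person's share would give a surface of the kind summarized in Fig. 3, under the assumptions stated here.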

Discussion 

The theory of welfare is not an empty ideal as claimed in [7]: dealing with a society where everyone has an
identical non-egalitarian preference, we have found that the theory recommends something very similar
to an egalitarian preference, instead of just rubber-stamping the dominant non-egalitarian opinion. In
addition, this finding shows that the egalitarian society is in fact the only strict Nash equilibrium. We
therefore conclude that our analysis gives strong support to equality of welfare by specifying which
social and political conditions make it possible.

On the other hand, our conclusion implies that a society can encourage egalitarianism by guaranteeing 
freedom of communication so that everyone can constantly express her fulfillment in public. In this 
respect, we can perhaps mention one of the central messages in [7] that "liberty is essential to any 
process in which equality is defined and secured." In particular, we would like to put an extra emphasis 
on the communicative aspect of the liberty. 

In a longer perspective, our results suggest an explanation of how the concept of fairness could
develop at a certain moment in the history of evolution when human beings became able to construct
internal expectations for the future and understand others' minds by communication. It is also worth
stressing that our conclusion on egalitarianism as a strict Nash equilibrium under certain well-defined
conditions is strong enough to open further theoretical extensions and empirical tests.

Figure 1. Person 1's welfare in the two-person case, obtained from Eq. (5). The curves on
the plane show contour lines.

Figure 2. Equation (6) as a function of $v_{kk}$ when $h_i \propto i^2$ for N = 10. The horizontal line shows
$r_k = 1/N$. The maximum of $r_k$ is located at the crossing with the line $r_k = v_{kk}$.

Figure 3. $r_M(v_{M1}, v_{MN})$ obtained by solving Eq. (8) within a region
$0 < (M-1)v_{M1} + (N-M)v_{MN} < 1$ for N = 10. (A) M = 2. (B) M = 5. The crosses show
$(v_{M1}, v_{MN}) = (1/N, 1/N)$.

Appendix: derivation of the affinity function

Let us consider two arbitrary distributions, $\mathbf{p} = (p_1, \ldots, p_N)$ and $\mathbf{q} = (q_1, \ldots, q_N)$, with $p_i \ge 0$, $q_i \ge 0$,
and $\sum_i p_i = \sum_i q_i = 1$. We will define an affinity function $\rho_N(\mathbf{p}, \mathbf{q})$ between them, whose specific
functional form will be characterized by requiring the following postulates.

• P1: Separability

$$\rho_N \begin{pmatrix} p_1, \ldots, p_N \\ q_1, \ldots, q_N \end{pmatrix} = \rho_{N-k+1} \begin{pmatrix} P_k, p_{k+1}, \ldots, p_N \\ Q_k, q_{k+1}, \ldots, q_N \end{pmatrix} + \Delta(P_k, Q_k) \left[ \rho_k \begin{pmatrix} \frac{p_1}{P_k}, \ldots, \frac{p_k}{P_k} \\ \frac{q_1}{Q_k}, \ldots, \frac{q_k}{Q_k} \end{pmatrix} - 1 \right],$$

where $P_k = p_1 + p_2 + \cdots + p_k$ and $Q_k = q_1 + q_2 + \cdots + q_k$ are positive with $1 < k < N$. In
addition, $\Delta(P_k, Q_k)$ is a non-negative differentiable function and converges to zero as $P_k \to 0$ or
$Q_k \to 0$. The situation can be described as follows: we first observe $\mathbf{p}' = (P_k, p_{k+1}, \ldots, p_N)$ and
$\mathbf{q}' = (Q_k, q_{k+1}, \ldots, q_N)$, instead of the real $\mathbf{p}$ and $\mathbf{q}$. We can calculate affinity $\rho_{N-k+1}$ in this
resolution. The question is how affinity will change when we come to know the real $\mathbf{p}$ and $\mathbf{q}$. If
the substructures inside $P_k$ and $Q_k$ have an exact affinity by $\rho_k = 1$, nothing will change from the
previous calculation, i.e., $\rho_N = \rho_{N-k+1}$. If they actually had no affinity by $\rho_k = 0$ in this better
resolution, on the other hand, the overall affinity should decrease by a certain amount $\Delta$, which
will be a function of $P_k$ and $Q_k$. For example, for $P_k \ll 1$ or $Q_k \ll 1$, the decrement in $\rho_N$ will be
also vanishingly small even if the subpopulations inside them look completely different.

• P2: Invariance under permutation

$$\rho_3 \begin{pmatrix} p_1, p_2, p_3 \\ q_1, q_2, q_3 \end{pmatrix} = \rho_3 \begin{pmatrix} p_2, p_1, p_3 \\ q_2, q_1, q_3 \end{pmatrix} = \rho_3 \begin{pmatrix} p_3, p_2, p_1 \\ q_3, q_2, q_1 \end{pmatrix}.$$

Note that it is for N = 3, which is enough to prove the same property for general N in combination
with the other postulates.

• P3: Non-negativity

$$\rho_N(\mathbf{p}, \mathbf{q}) \ge 0,$$

where $\rho_N(\mathbf{p}, \mathbf{q}) = 0$ if and only if $\mathbf{p}$ is orthogonal to $\mathbf{q}$, whereas a maximum value is obtained if and
only if $\mathbf{p} = \mathbf{q}$.

• P4: Symmetry

$$\rho_N(\mathbf{p}, \mathbf{q}) = \rho_N(\mathbf{q}, \mathbf{p}).$$



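Let us explain some direct consequences of P1. For N = 2, it yields a trivial equality. For N = 3, we
have

$$\rho_3 \begin{pmatrix} p_1, p_2, p_3 \\ q_1, q_2, q_3 \end{pmatrix} = \rho_2 \begin{pmatrix} P_2, p_3 \\ Q_2, q_3 \end{pmatrix} + \Delta(P_2, Q_2) \left[ \rho_2 \begin{pmatrix} \frac{p_1}{P_2}, \frac{p_2}{P_2} \\ \frac{q_1}{Q_2}, \frac{q_2}{Q_2} \end{pmatrix} - 1 \right].$$
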
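For N = 4, we will see two different ways to calculate affinity between $(p_1, \ldots, p_4)$ and $(q_1, \ldots, q_4)$. The first
way is the following:

\begin{align}
\rho_4 \begin{pmatrix} p_1, \ldots, p_4 \\ q_1, \ldots, q_4 \end{pmatrix}
&= \rho_2 \begin{pmatrix} P_3, p_4 \\ Q_3, q_4 \end{pmatrix} + \Delta(P_3, Q_3) \left[ \rho_3 \begin{pmatrix} \frac{p_1}{P_3}, \frac{p_2}{P_3}, \frac{p_3}{P_3} \\ \frac{q_1}{Q_3}, \frac{q_2}{Q_3}, \frac{q_3}{Q_3} \end{pmatrix} - 1 \right] \nonumber \\
&= \rho_2 \begin{pmatrix} P_3, p_4 \\ Q_3, q_4 \end{pmatrix} + \Delta(P_3, Q_3) \left\{ \rho_2 \begin{pmatrix} \frac{P_2}{P_3}, \frac{p_3}{P_3} \\ \frac{Q_2}{Q_3}, \frac{q_3}{Q_3} \end{pmatrix} + \Delta\left( \frac{P_2}{P_3}, \frac{Q_2}{Q_3} \right) \left[ \rho_2 \begin{pmatrix} \frac{p_1}{P_2}, \frac{p_2}{P_2} \\ \frac{q_1}{Q_2}, \frac{q_2}{Q_2} \end{pmatrix} - 1 \right] - 1 \right\}. \tag{9}
\end{align}

We then consider another way to calculate the same quantity as follows:

\begin{align}
\rho_4 \begin{pmatrix} p_1, \ldots, p_4 \\ q_1, \ldots, q_4 \end{pmatrix}
&= \rho_3 \begin{pmatrix} P_2, p_3, p_4 \\ Q_2, q_3, q_4 \end{pmatrix} + \Delta(P_2, Q_2) \left[ \rho_2 \begin{pmatrix} \frac{p_1}{P_2}, \frac{p_2}{P_2} \\ \frac{q_1}{Q_2}, \frac{q_2}{Q_2} \end{pmatrix} - 1 \right] \nonumber \\
&= \rho_2 \begin{pmatrix} P_3, p_4 \\ Q_3, q_4 \end{pmatrix} + \Delta(P_3, Q_3) \left[ \rho_2 \begin{pmatrix} \frac{P_2}{P_3}, \frac{p_3}{P_3} \\ \frac{Q_2}{Q_3}, \frac{q_3}{Q_3} \end{pmatrix} - 1 \right] + \Delta(P_2, Q_2) \left[ \rho_2 \begin{pmatrix} \frac{p_1}{P_2}, \frac{p_2}{P_2} \\ \frac{q_1}{Q_2}, \frac{q_2}{Q_2} \end{pmatrix} - 1 \right]. \tag{10}
\end{align}

Comparing Eq. (9) with Eq. (10), we see that

$$\Delta(P_3, Q_3)\, \Delta\left( \frac{P_2}{P_3}, \frac{Q_2}{Q_3} \right) = \Delta(P_2, Q_2). \tag{11}$$

Note that $P_2/P_3$ can take any value between zero and one, regardless of $P_3$, and that $Q_2/Q_3$ is also
independent of $Q_3$ in the same way, as long as $P_3$ and $Q_3$ are nonzero.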
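Now we define

$$g(x, y) \equiv \rho_2 \begin{pmatrix} x, 1-x \\ y, 1-y \end{pmatrix} - 1 \tag{12}$$

for $x, y \in [0, 1]$. From P1 and P2, it is straightforward to verify

$$g(x, y) = g(1-x, 1-y) \tag{13}$$

since

\begin{align}
\rho_3 \begin{pmatrix} p_1, p_2, p_3 \\ q_1, q_2, q_3 \end{pmatrix}
&= \rho_2 \begin{pmatrix} P_2, p_3 \\ Q_2, q_3 \end{pmatrix} + \Delta(P_2, Q_2) \left[ \rho_2 \begin{pmatrix} \frac{p_1}{P_2}, \frac{p_2}{P_2} \\ \frac{q_1}{Q_2}, \frac{q_2}{Q_2} \end{pmatrix} - 1 \right] \nonumber \\
= \rho_3 \begin{pmatrix} p_2, p_1, p_3 \\ q_2, q_1, q_3 \end{pmatrix}
&= \rho_2 \begin{pmatrix} P_2, p_3 \\ Q_2, q_3 \end{pmatrix} + \Delta(P_2, Q_2) \left[ \rho_2 \begin{pmatrix} \frac{p_2}{P_2}, \frac{p_1}{P_2} \\ \frac{q_2}{Q_2}, \frac{q_1}{Q_2} \end{pmatrix} - 1 \right]. \nonumber
\end{align}

Applying P1 to another permutation

$$\rho_3 \begin{pmatrix} p_1, p_2, p_3 \\ q_1, q_2, q_3 \end{pmatrix} = \rho_3 \begin{pmatrix} p_3, p_2, p_1 \\ q_3, q_2, q_1 \end{pmatrix},$$
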
we are directly led to

$$g(p_3, q_3) + \Delta(1-p_3, 1-q_3)\, g\left( \frac{p_1}{1-p_3}, \frac{q_1}{1-q_3} \right) = g(p_1, q_1) + \Delta(1-p_1, 1-q_1)\, g\left( \frac{p_3}{1-p_1}, \frac{q_3}{1-q_1} \right),$$

where $p_1, p_3, q_1, q_3 \in [0, 1)$. Let us redefine the variables as $\frac{p_1}{1-p_3} = p$, $\frac{q_1}{1-q_3} = q$, $1 - p_3 = r$, and $1 - q_3 = s$.
Each of these new variables p, q, r, and s can take an arbitrary value independently of one another, as
long as $p, q \in [0, 1]$ and $r, s \in (0, 1)$. The above equality can then be rewritten as

$$g(r, s) + \Delta(r, s)\, g(p, q) = g(pr, qs) + \Delta(1-pr, 1-qs)\, g\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right). \tag{14}$$

Defining a function

$$f(p, q, r, s) = g(r, s) + \left[ \Delta(r, s) + \Delta(1-r, 1-s) \right] g(p, q) \tag{15}$$

for $p, q, r, s \in (0, 1)$, we apply Eq. (14) to $f(p, q, r, s)$ to obtain

$$f(p, q, r, s) = g(pr, qs) + \Delta(1-pr, 1-qs)\, g\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) + \Delta(1-r, 1-s)\, g(p, q).$$

Note from Eq. (11) that

$$\Delta(p_2 + p_3, q_2 + q_3)\, \Delta\left( \frac{p_3}{p_2 + p_3}, \frac{q_3}{q_2 + q_3} \right) = \Delta(1-pr, 1-qs)\, \Delta\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) = \Delta(1-r, 1-s).$$

Hence, we see that

\begin{align}
f(p, q, r, s)
&= g(pr, qs) + \Delta(1-pr, 1-qs)\, g\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) + \Delta(1-pr, 1-qs)\, \Delta\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) g(p, q) \nonumber \\
&= g(pr, qs) + \Delta(1-pr, 1-qs) \left[ g\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) + \Delta\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) g(p, q) \right] \tag{16} \\
&= g(pr, qs) + \Delta(1-pr, 1-qs) \left[ g\left( \frac{p(1-r)}{1-pr}, \frac{q(1-s)}{1-qs} \right) + \Delta\left( \frac{1-p}{1-pr}, \frac{1-q}{1-qs} \right) g(r, s) \right], \tag{17}
\end{align}

where the last equality comes from Eq. (14). Equation (13) then gives us

$$g\left( \frac{p(1-r)}{1-pr}, \frac{q(1-s)}{1-qs} \right) = g\left( \frac{1-p}{1-pr}, \frac{1-q}{1-qs} \right),$$

relating Eqs. (16) and (17) as follows:

\begin{align}
f(p, q, r, s)
&= g(pr, qs) + \Delta(1-pr, 1-qs) \left[ g\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) + \Delta\left( \frac{1-r}{1-pr}, \frac{1-s}{1-qs} \right) g(p, q) \right] \nonumber \\
&= g(pr, qs) + \Delta(1-pr, 1-qs) \left[ g\left( \frac{1-p}{1-pr}, \frac{1-q}{1-qs} \right) + \Delta\left( \frac{1-p}{1-pr}, \frac{1-q}{1-qs} \right) g(r, s) \right]. \nonumber
\end{align}

In short, we have just confirmed that

$$f(p, q, r, s) = f(r, s, p, q).$$

From the definition of $f(p, q, r, s)$ in Eq. (15), we see that

$$\frac{g(r, s)}{g(p, q)} = \frac{\Delta(r, s) + \Delta(1-r, 1-s) - 1}{\Delta(p, q) + \Delta(1-p, 1-q) - 1}.$$

For this to be true for every independent set of (p, q, r, s), it must be that

$$g(r, s) = C \left[ \Delta(r, s) + \Delta(1-r, 1-s) - 1 \right]$$

with a certain constant C. Then P1 says that

$$\lim_{\substack{r \to 0 \\ s \to 1}} g(r, s) = -C,$$

which is the lowest possible value of $g(r, s)$. However, according to the definition of $g(r, s)$ in Eq. (12)
and P3, this lower bound should be $-1$. In other words, $C = 1$ and

$$\rho_2 \begin{pmatrix} r, 1-r \\ s, 1-s \end{pmatrix} = \Delta(r, s) + \Delta(1-r, 1-s).$$

We have restricted our variables as $r \in (0, 1)$ and $s \in (0, 1)$, but it is not difficult to check that the above
expression can be readily used in $r \in [0, 1]$ and $s \in [0, 1]$ when $\Delta$ is characterized as will be discussed
below. Then P1 works as a recursive relation, yielding an expression for general N:

$$\rho_N(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} \Delta(p_i, q_i). \tag{18}$$

Now we ask ourselves how $\Delta$ should look. P1 and P4 imply that $\Delta$ can be expanded as $\Delta(p_i, q_i) =
c_1 (p_i q_i)^{\alpha_1} + c_2 (p_i q_i)^{\alpha_2} + \cdots$ with coefficients $c_n$ and exponents $\alpha_n > 0$. So let us write

$$\rho_N(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} \sum_{n=1}^{\infty} c_n (p_i q_i)^{\alpha_n}.$$

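If we define $I_\alpha(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} (p_i q_i)^{\alpha}$, it means that $\rho_N$ should be of the following form,

$$\rho_N(\mathbf{p}, \mathbf{q}) = \sum_{n=1}^{\infty} c_n I_{\alpha_n}(\mathbf{p}, \mathbf{q}). \tag{19}$$

Among every possible $I_\alpha$, it is only $I_{1/2}$ that satisfies the maximization condition in P3. This can be
shown by variational calculus using a Lagrange multiplier $\mu$,

$$\left. \frac{\partial}{\partial q_i} \left[ I_\alpha(\mathbf{p}, \mathbf{q}) - \mu \sum_{j=1}^{N} q_j \right] \right|_{q_i = p_i} = \alpha p_i^{2\alpha - 1} - \mu = 0,$$

which is satisfied for any $p_i$ when $\alpha = 1/2$ and $\mu = \alpha$. Hence we separate this term from the others in
Eq. (19) as

$$\rho_N(\mathbf{p}, \mathbf{q}) = c I_{1/2}(\mathbf{p}, \mathbf{q}) + {\sum_n}' c_n I_{\alpha_n}(\mathbf{p}, \mathbf{q}),$$

where ${\sum_n}'$ means that $\alpha_n = 1/2$ is excluded from the summation. There exists a maximum value in $\rho_N$,
which is obtained by

$$\rho_N(\mathbf{p}, \mathbf{p}) = c I_{1/2}(\mathbf{p}, \mathbf{p}) + {\sum_n}' c_n I_{\alpha_n}(\mathbf{p}, \mathbf{p}) = c + {\sum_n}' c_n \sum_i p_i^{2\alpha_n},$$

noting that $I_{1/2}(\mathbf{p}, \mathbf{p}) = \sum_i p_i = 1$. Since this value is the same for every $\mathbf{p}$, the last term should be kept
constant for any $p_i$, and we have to choose $c_n = 0$ for every n in the second term. To sum up, our affinity
function is characterized as

$$\rho_N(\mathbf{p}, \mathbf{q}) = c \sum_{i=1}^{N} (p_i q_i)^{1/2}$$

with a positive constant c, the maximum value of $\rho_N$.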
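As a numerical sanity check of this result, the sketch below verifies that the final form satisfies the separability postulate P1 when $c = 1$ and $\Delta(P, Q) = \sqrt{PQ}$, which is what Eq. (18) implies for this choice. The sample distributions and all function names are illustrative assumptions.

```python
import math

def affinity(p, q):
    """rho_N(p, q) = sum_i sqrt(p_i * q_i), i.e. the final form with c = 1."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def delta(P, Q):
    """Single-bin contribution Delta(P, Q) = sqrt(P * Q) for c = 1."""
    return math.sqrt(P * Q)

def separability_gap(p, q, k):
    """Left-hand side minus right-hand side of P1 when the first k bins are merged."""
    P_k, Q_k = sum(p[:k]), sum(q[:k])
    coarse = affinity([P_k] + p[k:], [Q_k] + q[k:])
    inner = affinity([x / P_k for x in p[:k]], [y / Q_k for y in q[:k]])
    return affinity(p, q) - (coarse + delta(P_k, Q_k) * (inner - 1.0))

p = [0.1, 0.2, 0.3, 0.4]
q = [0.4, 0.3, 0.2, 0.1]
gaps = [separability_gap(p, q, k) for k in (2, 3)]
print(gaps)
```

The gaps vanish up to rounding error, so for these inputs the coarse-grained affinity plus the within-bin correction reproduces the full affinity.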



